Interactive calculator to compute the output shape of a convolutional layer based on the input, kernel and other settings. It supports non-square kernels, stride and padding.
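As a rough illustration of what such a calculator computes, here is a minimal sketch of the standard output-size formula, floor((input + 2 * padding - kernel) / stride) + 1, applied independently to height and width. The function name `conv_output_shape` and its parameters are hypothetical and chosen for this example; dilation is assumed to be 1 and is not modeled.

```python
def conv_output_shape(input_hw, kernel_hw, stride=(1, 1), padding=(0, 0)):
    """Return the (height, width) produced by a 2D convolution.

    Every argument is a (height, width) pair, so non-square kernels,
    strides and paddings are handled per dimension.
    Assumption: dilation is 1 and padding is symmetric.
    """
    output = []
    for size, kernel, step, pad in zip(input_hw, kernel_hw, stride, padding):
        padded = size + 2 * pad
        if kernel > padded:
            raise ValueError("kernel is larger than the padded input")
        # Count how many kernel positions fit along this dimension.
        output.append((padded - kernel) // step + 1)
    return tuple(output)


# Square case: 227x227 input, 11x11 kernel, stride 4, no padding.
print(conv_output_shape((227, 227), (11, 11), stride=(4, 4)))  # (55, 55)

# Non-square case: 32x64 input, 3x5 kernel, stride (1, 2), padding (1, 2).
print(conv_output_shape((32, 64), (3, 5), stride=(1, 2), padding=(1, 2)))  # (32, 32)
```

The number of output channels is not part of this calculation, since it equals the number of filters in the layer.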
Let's dive into the paper 'ImageNet Classification with Deep Convolutional Neural Networks' by Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton, which introduced AlexNet to the world and marked a pivotal moment in the fields of computer vision and deep learning. The goal here is to explore its achievements, architecture, and details in depth as a preliminary step toward implementing it.
In this post, we will go through all the elements required to create and train AlexNet following the original paper. We will cover data processing, architecture definition, coding of the training and validation loops, and optimizations to speed up training. The trained network achieves comparable results, with a top-1 error rate of 39.9% and a top-5 error rate of 17.7%.
Continuing our exploration of foundational deep learning models in computer vision, we will dive into the 2014 paper 'Very Deep Convolutional Networks for Large-Scale Image Recognition' by Karen Simonyan and Andrew Zisserman, which introduced VGGNet, a set of simple yet highly performant networks. We will examine its architecture, data processing, training, testing, and analysis of the results as a preliminary step toward implementing it.
Let's code and train VGGNet from scratch! In this post, I will explain the process of implementing this iconic CNN: designing a general architecture, using dense evaluation, optimizing training speed, and finally training the network to obtain validation top-1 and top-5 error rates of 28.33% and 9.66%, respectively. I will also compare the error rates and training performance against the original paper and AlexNet.
Building upon the recently implemented and trained VGGNet, we will go through and implement 'A Neural Algorithm of Artistic Style' by Gatys, Ecker and Bethge, which made it possible to transfer the style of one image onto the content of another. I will dive into some of the complexities of the implementation and show some beautiful examples of the results this technique can produce.